Predicting Maximum Data Staleness in Real-Time Warehouses

Authors

  • Hennadiy Leontyev
  • Theodore Johnson
  • James H. Anderson
Abstract

This paper presents an analysis technique for estimating maximum data staleness in a data warehouse that collects “near-real-time” data streams. Data is pushed to the warehouse from a variety of external sources with a wide range of inter-arrival times (e.g., once a minute to once a day). In prior work, ad hoc heuristic algorithms have been proposed for scheduling warehouse updates. In this paper, global multiprocessor real-time scheduling algorithms are considered as an alternative. It is shown that schedulability results concerning such algorithms can be used to analytically derive upper bounds on maximum data staleness based upon characteristics of warehouse tables and the parameters of update tasks. Simulation experiments are presented that show the effectiveness of the proposed approach.

Similar resources

Reduction of Materialized View Staleness Using Online Updates

Updating the materialized views stored in data warehouses usually implies making the warehouse unavailable to users. We propose MAUVE, a new algorithm for online incremental view updates that uses timestamps and allows consistent read-only access to the warehouse while it is being updated. The algorithm propagates the updates to the views more often than the typical once a day in order to reduce ...

HyDash: A Dashboard for Real-Time Business Intelligence based on the HyPer Main Memory Database System

Business Intelligence (BI) is a set of techniques that help improve business decision making. From a technical point of view, BI relies on a set of tools which includes performance dashboards: layered services that combine monitoring, analysis, and reporting. However, most dashboard solutions today are based on data warehouses which face a problem of data staleness—a circumstance caused by the ...

Probabilistically Bounded Staleness for Practical Partial Quorums

Modern storage systems employing quorum replication are often configured to use partial, non-strict quorums. These systems wait only for a subset of their replicas to respond to a request before returning an answer, without guaranteeing that read and write replica sets intersect. While these partial quorum mechanisms provide only basic eventual consistency guarantees, with no limit to the recen...

Defining and Measuring Data-Driven Quality Dimension of Staleness

With the growing complexity of data acquisition and processing methods, there is an increasing demand for understanding which data is outdated and how to keep it as fresh as possible. Staleness is one of the key time-related data quality characteristics; it represents the degree of synchronization between data originators and the information systems possessing the data. However, nowadays there is ...

Real Time Pseudo-Range Correction Predicting by a Hybrid GASVM model in order to Improve RTDGPS Accuracy

A differential base station is sometimes unable to send correction information for minutes at a time, due to radio interference or loss of signal. To overcome the degradation caused by the loss of Differential Global Positioning System (DGPS) Pseudo-Range Correction (PRC), the PRC can be predicted. In this paper, the Support Vector Machine (SVM) and Genetic Algorithms (GAs) will be incorpor...

Publication date: 2009